378 research outputs found

    CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection

    Robust face detection in the wild is one of the key components supporting various face-related problems, e.g. unconstrained face recognition, facial periocular recognition, facial landmarking and pose estimation, facial expression recognition, and 3D facial model construction. Although the face detection problem has been intensely studied for decades and has various commercial applications, it still runs into problems in some real-world scenarios due to numerous challenges, e.g. heavy facial occlusions, extremely low resolutions, strong illumination, exceptional pose variations, and image or video compression artifacts. In this paper, we present a face detection approach named Contextual Multi-Scale Region-based Convolution Neural Network (CMS-RCNN) to robustly solve the problems mentioned above. Similar to the region-based CNNs, our proposed network consists of a region proposal component and a region-of-interest (RoI) detection component. However, beyond that design, two main contributions in our proposed network play a significant role in achieving state-of-the-art performance in face detection. Firstly, multi-scale information is grouped in both region proposal and RoI detection to deal with tiny face regions. Secondly, our proposed network allows explicit body contextual reasoning, inspired by the intuition of the human vision system. The proposed approach is benchmarked on two recent challenging face detection databases, i.e. the WIDER FACE Dataset, which contains a high degree of variability, and the Face Detection Dataset and Benchmark (FDDB). The experimental results show that our proposed approach, trained on the WIDER FACE Dataset, outperforms strong baselines on WIDER FACE by a large margin and consistently achieves competitive results on FDDB against recent state-of-the-art face detection methods.

    Towards a reliable face recognition system.

    Face Recognition (FR) is an important area in computer vision with many applications such as security and automated border controls. Recent advancements in this domain have pushed the performance of models to human-level accuracy. However, varying real-world conditions pose further challenges for their adoption. In this paper, we investigate the performance of these models by analyzing a cross-section of face detection and recognition models. Experiments were carried out without any preprocessing on three state-of-the-art face detection methods, namely HOG, YOLO and MTCNN, and three recognition models, namely VGGFace2, FaceNet and ArcFace. Our results indicate that these methods rely significantly on preprocessing for optimum performance.

    Multi-view Face Detection Using Deep Convolutional Neural Networks

    In this paper we consider the problem of multi-view face detection. While there has been significant research on this problem, current state-of-the-art approaches for this task require annotation of facial landmarks, e.g. TSM [25], or annotation of face poses [28, 22]. They also require training dozens of models to fully capture faces in all orientations, e.g. 22 models in the HeadHunter method [22]. In this paper we propose Deep Dense Face Detector (DDFD), a method that does not require pose/landmark annotation and is able to detect faces in a wide range of orientations using a single model based on deep convolutional neural networks. The proposed method has minimal complexity; unlike other recent deep learning object detection methods [9], it does not require additional components such as segmentation, bounding-box regression, or SVM classifiers. Furthermore, we analyzed scores of the proposed face detector for faces in different orientations and found that 1) the proposed method is able to detect faces from different angles and can handle occlusion to some extent, and 2) there seems to be a correlation between the distribution of positive examples in the training set and the scores of the proposed face detector. The latter suggests that the proposed method's performance can be further improved by using better sampling strategies and more sophisticated data augmentation techniques. Evaluations on popular face detection benchmark datasets show that our single-model face detector algorithm has similar or better performance compared to the previous methods, which are more complex and require annotations of either different poses or facial landmarks. Comment: in International Conference on Multimedia Retrieval 2015 (ICMR

    Stacking-fault energies for Ag, Cu, and Ni from empirical tight-binding potentials

    The intrinsic stacking-fault energies and free energies for Ag, Cu, and Ni are derived from molecular-dynamics simulations using the empirical tight-binding potentials of Cleri and Rosato [Phys. Rev. B 48, 22 (1993)]. While the results show significant deviations from experimental data, the general trend between the elements remains correct. This allows the potentials to be used for qualitative comparisons between metals with high and low stacking-fault energies. Moreover, the effect of stacking faults on the local vibrational properties near the fault is examined. It turns out that the stacking fault has the strongest effect on modes in the center of the transverse peak, and its effect is localized in a region of approximately eight monolayers around the defect. Comment: 5 pages, 2 figures, accepted for publication in Phys. Rev.

    Image Co-localization by Mimicking a Good Detector's Confidence Score Distribution

    Given a set of images containing objects from the same category, the task of image co-localization is to identify and localize each instance. This paper shows that this problem can be solved by a simple but intriguing idea: a common object detector can be learnt by making its detection confidence scores distributed like those of a strongly supervised detector. More specifically, we observe that, given a set of object proposals extracted from an image that contains the object of interest, an accurate strongly supervised object detector should give high scores to only a small minority of proposals and low scores to most of them. Thus, we devise an entropy-based objective function to enforce this property when learning the common object detector. Once the detector is learnt, we resort to a segmentation approach to refine the localization. We show that despite its simplicity, our approach outperforms state-of-the-art methods. Comment: Accepted to Proc. European Conf. Computer Vision 201

    Residual attention regression for 3D hand pose estimation

    Development of a tight-binding potential for bcc-Zr. Application to the study of vibrational properties

    We present a tight-binding potential based on the moment expansion of the density of states, which includes up to the fifth moment. The potential is fitted to bcc and hcp Zr and is applied to the computation of vibrational properties of bcc-Zr. In particular, we compute the isothermal elastic constants in the temperature range 1200 K < T < 2000 K by means of standard Monte Carlo simulation techniques. The agreement with experimental results is satisfactory, especially in the case of the stability of the lattice with respect to the shear associated with C'. However, the temperature decrease of the Cauchy pressure is not reproduced. The T = 0 K phonon frequencies of bcc-Zr are also computed. The potential predicts several instabilities of the bcc structure, and a crossing of the longitudinal and transverse modes in the (001) direction. This is in agreement with recent ab initio calculations in Sc, Ti, Hf, and La. Comment: 14 pages, 6 tables, 4 figures, revtex; the kinetic term of the isothermal elastic constants has been corrected (Eq. (4.1), Table VI and Figure 4

    Tweeting Cameras for Event Detection

    A novel infrared video surveillance system using deep learning based techniques

    This is the author accepted manuscript. The final version is available from Springer via the DOI in this record. This paper presents a new, practical infrared video based surveillance system, consisting of a resolution-enhanced, automatic target detection/recognition (ATD/R) system that is widely applicable in civilian and military applications. To deal with the small number of pixels on target encountered in long range imagery, a super-resolution method is employed in the developed ATD/R system to increase target signature resolution and optimise the baseline quality of inputs for object recognition. To tackle the challenge of detecting extremely low-resolution targets, we train a convolutional neural network (CNN) based faster-RCNN using long wave infrared imagery datasets that were prepared and marked in-house. The system was tested under different weather conditions, using two datasets featuring target types comprising pedestrians and 6 different types of ground vehicles. The developed ATD/R system can detect extremely low-resolution targets with superior performance by effectively addressing the small number of pixels on target encountered in long range applications. A comparison with traditional methods confirms this superiority both qualitatively and quantitatively. This work was funded by Thales UK, the Centre of Excellence for Sensor and Imaging System (CENSIS), and the Scottish Funding Council under the project “AALART. Thales-Challenge Low-pixel Automatic Target Detection and Recognition (ATD/ATR)”, ref. CAF-0036. Thanks are also given to the Digital Health and Care Institute (DHI, project Smartcough-MacMasters), which partially supported Mr. Monge-Alvarez’s contribution, and to the Royal Society of Edinburgh and National Science Foundation of China for the funding associated with the project “Flood Detection and Monitoring using Hyperspectral Remote Sensing from Unmanned Aerial Vehicles”, which partially covered Dr. Casaseca-de-la-Higuera’s, Dr. Luo’s, and Prof. Wang’s contribution. Dr. Casaseca-de-la-Higuera would also like to acknowledge the Royal Society of Edinburgh for the funding associated with project “HIVE”.

    Probabilistic Computation in Human Perception under Variability in Encoding Precision

    A key function of the brain is to interpret noisy sensory information. To do so optimally, observers must, in many tasks, take into account knowledge of the precision with which stimuli are encoded. In an orientation change detection task, we find that encoding precision not only depends on an experimentally controlled reliability parameter (shape), but also exhibits additional variability. In spite of this variability in precision, human subjects seem to take precision into account near-optimally on a trial-to-trial and item-to-item basis. Our results offer a new conceptualization of the encoding of sensory information and highlight the brain’s remarkable ability to incorporate knowledge of uncertainty during complex perceptual decision-making.